Understanding Computer Architecture Classification Systems
Parallel computing has become an essential technology in modern computers, driven by constantly increasing demands for higher performance, lower cost, and sustained productivity in real applications. Concurrent events are commonplace in today's high-performance computers because of practices such as multiprogramming, multiprocessing, and multicomputing.
Parallelism can take the form of look ahead, pipelining, vectorization, concurrency, simultaneity, data parallelism, partitioning, interleaving, overlapping, multiplicity, replication, time sharing, space sharing, multitasking, multiprogramming, multithreading, and distributed computing at different processing levels.
Parallel computing is a type of computation in which many calculations are performed at the same time, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently.
The following classifications of parallel computers have been identified:

- Flynn's classification, which focuses on how instruction and data streams flow through the system
- Structural classification (such as Handler's), which examines the physical organization and architecture
- Classification based on memory access, which analyzes memory access patterns and organization
- Feng's classification, which considers the size of the data chunks processed in parallel
This categorization of computers was first proposed by Michael Flynn in 1966. Flynn did not take the machine's hardware structure into account when classifying parallel computers; instead, he used the notions of instruction and data streams to categorize them. Not all of the computers in Flynn's classification are parallel, but understanding all four of his categories is essential background for understanding parallel computers.
Because this classification relies on instruction and data flows, we first need to grasp how the instruction cycle functions.
The instruction cycle is the sequence of steps needed to execute one instruction of a program. A typical instruction is composed of two parts: the opcode and the operand. The opcode specifies the operation to be performed, and the operand part specifies the data on which that operation is to be done.
The operand part is itself divided into two fields: the addressing mode and the operand. The addressing mode specifies how the address of the actual data is determined, and the operand field supplies the argument that the addressing mode uses to compute that address.
The control unit of the central processing unit (CPU) fetches instructions from the program sequentially, one at a time. Each fetched instruction is decoded by the decoder, a part of the control unit, and the processor then executes it. The result of the execution is temporarily stored in the Memory Buffer Register (MBR), also known as the Memory Data Register.
The term 'stream' refers to a sequence or flow of either instructions or data operated on by the computer. Over the complete cycle of instruction execution, a flow of instructions from main memory to the CPU is established; this flow is called the instruction stream. Similarly, operands flow bi-directionally between the processor and memory; this flow is called the data stream.
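The fetch-decode-execute cycle and the two streams can be sketched with a toy machine in Python. The instruction format, opcodes, and registers below are invented purely for illustration:

```python
# Toy fetch-decode-execute loop illustrating the instruction stream
# (program -> CPU) and the data stream (CPU <-> memory).
# All opcodes and the single-accumulator design are invented for this sketch.

def run(program, memory):
    """Each instruction is (opcode, operand); the operand is a memory address."""
    acc = 0                            # accumulator register
    for opcode, operand in program:    # fetch: next instruction from the stream
        if opcode == "LOAD":           # decode + execute
            acc = memory[operand]      # data stream: memory -> CPU
        elif opcode == "ADD":
            acc += memory[operand]
        elif opcode == "STORE":
            memory[operand] = acc      # data stream: CPU -> memory
    return memory

memory = {0: 2, 1: 3, 2: 0}
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2)]
run(program, memory)
print(memory[2])  # 5
```

Each loop iteration consumes one element of the instruction stream, while the `memory[...]` reads and writes form the data stream.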
Parallel computing is a form of computation in which a task is divided into separate parts that can be worked on at the same time. Each part is further broken down into a sequence of instructions, and the instructions from the different parts execute simultaneously on different CPUs.
Parallel systems involve the concurrent use of multiple computer resources, which can include a single computer with multiple processors, several computers linked by a network to create a parallel processing cluster, or a mix of both. Parallel systems are harder to program than single-processor computers because the architecture varies based on the parallel computer and the processes across multiple CPUs need to be coordinated and synchronized.
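The divide-and-work-concurrently idea above can be sketched in Python with the standard-library `concurrent.futures` module; the chunking scheme and worker count here are arbitrary choices for illustration:

```python
# Sketch: divide a large problem (summing a list) into parts that are
# worked on concurrently, then combine the partial results.
from concurrent.futures import ThreadPoolExecutor

def parallel_sum(data, workers=4):
    chunk = len(data) // workers or 1
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(sum, parts))   # each part is summed concurrently

print(parallel_sum(list(range(1001))))  # 500500
```

Note that in CPython, threads do not execute Python bytecode in parallel because of the global interpreter lock; true CPU parallelism would use processes instead, but the decomposition pattern is the same.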
CPUs are at the core of parallel processing. Based on the number of instruction and data streams that can be handled at the same time, computing systems are categorized into four major types:
- Single-Instruction, Single-Data stream (SISD)
- Single-Instruction, Multiple-Data stream (SIMD)
- Multiple-Instruction, Single-Data stream (MISD)
- Multiple-Instruction, Multiple-Data stream (MIMD)
Traditional sequential computers are classified as SISD (single instruction stream over a single data stream) machines. Instructions are executed one after another, although their execution phases may overlap through pipelining.
SIMD represents computers with vector/array processing capabilities alongside scalar hardware. Multiple processing elements are supervised by a single control unit: every processing element receives the same instruction from the control unit but operates on a different data set from a separate data stream.
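The SIMD idea can be sketched as a single broadcast operation applied across several "lanes" of data; the function name and lane representation are invented for illustration:

```python
# SIMD sketch: one control unit broadcasts the same instruction to
# several processing elements, each holding its own data element.
def simd_step(operation, data_elements):
    """Apply one broadcast instruction to every processing element's data."""
    return [operation(x) for x in data_elements]  # all PEs perform the same op

lanes = [1, 2, 3, 4]   # one data stream per processing element
print(simd_step(lambda x: x * 10, lanes))  # [10, 20, 30, 40]
```

The key property is that only one instruction (`operation`) exists per step, while the data differs per processing element.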
In MISD machines, the same data stream flows through a linear array of processors executing different instruction streams. This architecture is also known as a systolic array, used for pipelined execution of specific algorithms.
MIMD is the most popular model, and the term 'parallel computer' is often reserved for MIMD machines. Of the four machine models, most parallel computers built in the past were based on the MIMD model for general-purpose computing. The SIMD and MISD models are better suited to special-purpose computations. As a result, MIMD is the most widely implemented model, followed by SIMD, while MISD is the least common in commercial machines.
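In contrast to the SIMD case, an MIMD sketch has independent workers running *different* instruction streams on *different* data streams; the two task functions below are invented examples:

```python
# MIMD sketch: independent processors run different instruction streams
# on different data streams at the same time.
from concurrent.futures import ThreadPoolExecutor

def task_a(data):      # one instruction stream: compute a sum
    return sum(data)

def task_b(data):      # a different instruction stream: find a maximum
    return max(data)

with ThreadPoolExecutor(max_workers=2) as pool:
    fa = pool.submit(task_a, [1, 2, 3])   # its own data stream
    fb = pool.submit(task_b, [9, 4, 7])   # another data stream
    print(fa.result(), fb.result())  # 6 9
```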
Tse-yun Feng suggested using the degree of parallelism to classify computer architectures. The maximum degree of parallelism is the maximum number of bits that can be processed simultaneously, P = n × m, where n is the word length and m is the bit-slice length (the number of words processed at once).
Under this classification:

- WSBS (Word-Serial, Bit-Serial): one bit is processed at a time
- WPBS (Word-Parallel, Bit-Serial): an m-bit slice (one bit from each of m words) is processed at a time
- WSBP (Word-Serial, Bit-Parallel): one word of n bits is processed at a time
- WPBP (Word-Parallel, Bit-Parallel): an array of n × m bits is processed at a time
| Mode | Word length (n), bit-slice (m) | Degree of Parallelism | Example |
|---|---|---|---|
| WSBS | n = 1, m = 1 | (1, 1) | The "MINIMA" |
| WPBS | n = 1, m > 1 | (1, 256), (1, 16384), (1, 4096) | STARAN, MPP, DAP |
| WSBP | n > 1, m = 1 | (64, 1), (60, 1), (48, 1), (16/32, 1) | IBM 370/168 UP, CDC 6600, Burroughs B7700, VAX 11/780 |
| WPBP | n > 1, m > 1 | (64, 64) | Illiac IV |
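Feng's mode labels and degree of parallelism follow directly from the (n, m) pair, which a small Python helper (an invented name, for illustration) can make concrete:

```python
# Feng's classification: the mode label and maximum degree of parallelism
# P = n * m follow from the word length n and bit-slice length m,
# matching the (n, m) pairs in the table above.
def feng_mode(n, m):
    word = "WP" if m > 1 else "WS"   # word-parallel when many words at once
    bits = "BP" if n > 1 else "BS"   # bit-parallel when many bits per word
    return word + bits, n * m        # (mode label, degree of parallelism)

print(feng_mode(1, 1))      # ('WSBS', 1)     e.g. the "MINIMA"
print(feng_mode(64, 1))     # ('WSBP', 64)    e.g. IBM 370/168 UP
print(feng_mode(1, 16384))  # ('WPBS', 16384) e.g. MPP
```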
In 1977, Wolfgang Handler proposed an elaborate notation for expressing the pipelining and parallelism of computers. Handler's classification considers the computer at three distinct levels:

- Processor control unit (PCU)
- Arithmetic logic unit (ALU)
- Bit-level circuit (BLC)

The PCU corresponds to a processor or CPU, the ALU corresponds to a functional unit or a processing element, and the BLC corresponds to the logic circuitry needed to perform one-bit operations in the ALU.
Handler's classification uses the following three pairs of integers to describe a computer:

Computer = (k × k', d × d', w × w')

Where:

- k = number of processor control units (PCUs)
- k' = number of PCUs that can be pipelined
- d = number of ALUs controlled by each PCU
- d' = number of ALUs that can be pipelined
- w = word length of an ALU or processing element
- w' = number of pipeline stages in an ALU or processing element
The following rules and operators are used to show the relationship between the various elements of the computer:

- The '*' operator indicates that the units are pipelined, i.e., connected in series
- The '+' operator indicates that the units are not pipelined but work in parallel
- The 'v' operator indicates that the hardware can operate in one of several alternative modes
- The '~' operator indicates a range of values for a parameter
The CDC 6600 has a single central processor supported by 10 I/O processors. One control unit coordinates one ALU with a 60-bit word length, and that ALU has 10 functional units which can be organized into a pipeline. The 10 peripheral I/O processors can work in parallel with one another and with the CPU, and each contains one 12-bit ALU. In Handler's notation the system can be written as (10, 1, 12) * (1, 1 × 10, 60).
The TI ASC has one controller coordinating four arithmetic units; each arithmetic unit is an eight-stage pipeline with 64-bit words. In Handler's notation: (1, 4, 64 × 8).
The Cray-1 is a 64-bit single-processor computer whose ALU has twelve functional units, eight of which can be chained together to form a pipeline. Different functional units have from 1 to 14 segments, which can also be pipelined. In Handler's notation: (1, 12 × 8, 64 × (1 ~ 14)).
The C.mmp was designed to facilitate research into parallel computer architectures, so it can be extensively reconfigured. The system consists of 16 PDP-11 minicomputers with 16-bit word lengths, interconnected by a crossbar switching network. Because it can be reconfigured into MIMD, SIMD, and pipelined modes, its Handler description combines the three alternatives with the 'v' operator: (16, 1, 16) v (1 × 16, 1, 16) v (1, 16, 16).
The '*' and '+' operators are used to combine several separate pieces of hardware, while the 'v' operator differs from the other two in that it combines the alternative operating modes of a single piece of hardware.
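A minimal Python sketch can model Handler triples and the combining operators described above; the class name `H` and the use of `|` to stand in for 'v' are arbitrary choices for this illustration:

```python
# Sketch of Handler's notation: a triple (k, d, w) of PCUs, ALUs per PCU,
# and word length, with operators joining hardware parts or modes.
# '*' and '+' join separate hardware parts; 'v' (here: the | operator)
# joins alternative operating modes of one piece of hardware.
class H:
    def __init__(self, k, d, w):
        self.s = f"({k}, {d}, {w})"

    def _join(self, op, other):
        r = H.__new__(H)
        r.s = f"{self.s} {op} {other.s}"
        return r

    def __mul__(self, other):   # '*': pipelined (series) connection
        return self._join("*", other)

    def __add__(self, other):   # '+': units operating in parallel
        return self._join("+", other)

    def __or__(self, other):    # 'v': alternative operating modes
        return self._join("v", other)

    def __str__(self):
        return self.s

# CDC 6600: ten 12-bit I/O processors feeding one 60-bit CPU with
# 10 pipelined functional units (pairing taken from the text above).
print(H(10, 1, 12) * H(1, "1 x 10", 60))  # (10, 1, 12) * (1, 1 x 10, 60)

# Two alternative modes joined with 'v':
print(H(16, 1, 16) | H(1, 16, 16))        # (16, 1, 16) v (1, 16, 16)
```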
While Flynn's classification is straightforward, Handler's is unwieldy: the heavy use of numbers in its notation makes it abstract and hard to read. Handler's classification is strongly oriented towards describing pipelines and chains. Although it can adequately describe the parallelism within a single processor, it does not capture the variety of parallelism found in multiprocessor computers.